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ABSTRACT 

We develop an analytical formalism that is suitable for studying inhomogeneous 
structure formation, by studying the joint statistics of dark matter halos forming at 
two points. Extending the Bond et al. (1991) derivation of the mass function of virial- 
ized halos, based on excursion sets, we derive an approximate analytical expression for 
the "bivariate" mass function of halos forming at two redshifts and separated by a fixed 
comoving Lagrangian distance. Our approach also leads to a self-consistent expression 
for the nonlinear biasing and correlation function of halos, generalizing a number of 
previous results including those by Kaiser (1984) and Mo &: White (1996). We compare 
our approximate solutions to exact numerical results within the excursion-set frame- 
work and find them to be consistent to within 2% over a wide range of parameters. Our 
formalism can be used to study various feedback effects during galaxy formation ana- 
lytically, as well as to simply construct observable quantities dependent on the spatial 
distribution of objects. 

Subject headings: cosmology: theory - galaxies: formation - large-scale structure of 
universe - methods: analytical 



1. Introduction 



A critical prediction of any theory of structure formation is the mass function of virialized dark- 
matter halos. As the gravitational collapse of dark matter is thought to be the dominant force in 
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structure formation, an accurate determination of the number density of halos as a function of mass 
and redshift is a critical step towards understanding the observed abundances of galaxies, clusters, 
and other cosmological objects. 

In the study of structure formation, two main methods have emerged to evaluate this quan- 
tity: computational methods that solve the equations of gravitational collapse numerically, and 
analytical techniques that approximate these results with simple one-dimensional functions. While 
only numerical methods capture the full details of dark matter collapse, much of our understand- 
ing of structure formation relies instead on analytical techniques. As such methods are based on 
simple assumptions and are easily applied to a large range of models, they are indispensable both 
for gaining physical understanding into the numerical results and exploring the effects of model 
uncertainties. 

The most widely applied method of this type was first developed by Press & Schechter (1974). 
In this model, the abundance of halos at a redshift z is determined from the linear density field by 
applying a simple model of spherical collapse to associate peaks in this field with virialized objects 
in a full nonlinear treatment. This simple model, later refined by Bond et al. (1991), Lacey h Cole 
(1993), and others, has had great success in describing the formation of structure, reproducing the 
numerical results much more accurately than might be expected given the approximations involved. 

Yet this model is intrinsically limited since it can only predict the average number density of 
halos, without supplying any information as to their relative positions. Although this is sufficient 
for studying halo evolution, baryonic objects forming within these halos are often subject to strong 
environmental effects that are untreatable in this context. As a simple first-order approximation, 
many authors have tried to reconstruct the formation history of baryonic objects by combining 
the Press-Schechter approach with average intergalactic medium (IGM) conditions as a function of 
redshift, bathing all cosmological objects in the same UV background flux or assuming the same 
metal pre-enrichment for all galaxies. 

Many of the most important environmental effects, however, are in reality extremely inhomo- 
geneous in nature, being caused by the nonlinear structures that form within the IGM, and thus 
primarily impacting the areas near these structures. Such interactions between the IGM and struc- 
ture formation are often better described as spatially-dependent feedback loops rather than sudden 
changes in the overall average conditions. Processes of this sort include the formation of the first 
cosmological objects and the dissociation of molecular hydrogen (e.g., Haiman, Rees &: Loeb 1996; 
Haiman, Abel h Rees 2000; Ciardi, Ferrara, h Abel 2000), galaxy formation and photoevaporation 
during reionization (e.g., Efstathiou 1992; Gnedin & Ostriker 1997; Miralda-Escude h Rees 1998; 
Barkana &; Loeb 1999) and the impact of galactic outflows on the formation of neighboring objects 
(e.g., Scannapieco, Tliacker &: Davis 2001; Scannapieco & Broadhurst 2001). While a complete 
treatment of these issues can only be achieved numerically, unlike simulation of average quantities, 
simulations of structure formation in inhomogeneous environments have no analytical counterparts 
with which they can be compared. This greatly reduces the parameter space of models that can be 
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studied and leaves us without a more basic theoretical understanding that can put the simulation 
results in a broader context. 

Even when objects do not have a large effect on the formation of their neighbors, issues 
related to the spatial correlations between halos often arise when comparing theoretical predictions 
to the observed distribution of objects. Thus analytical estimates of galaxy cluster correlation 
functions (e.g., Mo, Jing, & White 1996), and of the contribution of collapsed objects to the angular 
power-spectrum of the Cosmic Microwave Background (e.g., Komatsu & Kitayama 1999; Knox, 
Scoccimarro, & Dodelson 1998; Scannapieco, Silk, k. Tan 2000), depend on supplementing the Press- 
Schechter number densities with additional approximate models. While a number of such models 
exist, along with accurate fitting formulae for both observational and numerical correlations (e.g., 
Efstathiou et al. 1988; Cole & Kaiser 1989; Mo & White 1996; Jing 1998, 1999), all such techniques 
represent the grafting of external information onto the underlying excursion-set calculation. 

In this work, wc develop an approximate analytical model that can address these issues. In- 
spired by the success of the Press h Schechter (1974) model of structure formation, we return to 
the linear excursion-set formalism and consider the collapse of two neighboring points. While the 
exact solution to this problem can only be obtained numerically, we show that such results can be 
reproduced analytically with great accuracy by introducing a simple and well-motivated approxi- 
mation. These analytical expressions then provide an extension to the peaks framework that can 
be used to quickly and easily address issues of inhomogeneous structure formation. 

The structure of this work is as follows. In §2 we review the derivation of the Press-Schechter 
formalism used to construct the average mass function of halos in the universe. In §3 wc extend this 
formalism using an approximate form of the two-point density distribution, which we compare to 
exact numerical results. In §4 we construct the mass function in the neighborhood of an overdense 
region with a fixed mass and collapse time, and estimate the nonlinear bias between halos in this 
model. Our conclusions are summarized in §5, and appendixes are included that provide explicit 
expressions that are necessary to evaluate our analytical results and describe a publicly available 
code that makes it easy to use our formalism for specific applications. Wc have added an Erratum 
(immediately preceding the appendixes) which adds and discusses a previously missing reference. 

2. Halo Collapse Around a Single Point 

Before addressing the formation of virialized halos at two correlated points, we first review in 
this section the approach of Bond et al. (1991) which leads to the standard one-point expressions. 
We work with the linear overdensity field (5(x, z) = p(x, z)/p{z) — 1, where x is a comoving position 
in space, z is the cosmological redshift and p is the mass density, with p being the mean mass 
density. In the linear regime, the density field maintains its shape in comoving coordinates and 
the overdensity simply grows as 5 = 5iD{z)/ D{zi), where Zi and Si are the initial redshift and 
overdensity, and D{z) is the linear growth factor [given by eq. (10) in Eisenstein &; Hu (1999)]. 
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When the overdensity in a given expanding region becomes non-linear, the expansion halts and the 
region turns around and collapses to form a virialized halo. 

The time at which the region virializes can be estimated based on the initial linear overdensity, 
using as a guide the collapse of a spherical top-hat perturbation. At the moment at which a top hat 
collapses to a point, the overdensity predicted by linear theory is Sc = 1.686 (Peebles 1980) in the 
Einstein-de Sitter model. This value depends weakly on the cosmological parameters, but in the 
quantitative plots shown in this paper we fix 6c = 1.686 for simplicity. Our analytical expressions, 
however, do not depend on this particular choice. 

A useful alternative way to view the evolution of density is to consider the linear density field 
extrapolated to the present time, i.e., the initial density field at high redshift extrapolated to the 
present by multiplication by the relative growth factor. In this case, the critical threshold for 
collapse at redshift z becomes redshift dependent. 



We adopt this view, and throughout this paper the power spectrum P{k) refers to the initial power 
spectrum, linearly-extrapolated to the present (i.e., not including non- linear evolution). 

At a given z, wc consider the smoothed density in a region around a fixed point A in space. We 
begin by averaging over a large mass scale M, or, equivalently, by including only small comoving 
wavenumbers k. We then lower M until we find the highest value for which the averaged overdensity 
is higher than 6c{z) and assume that the point A belongs to a halo with a mass M corresponding 
to this filter scale. 

Note that this description of structure formation is essentially a Lagrangian one, as it gives us 
no information as to the motions of peaks. Instead, this approach provides information only as to 
the size of the halo in which the material initially at a point A is contained. This distinction will 
prove especially important when considering two-point quantities, but must be kept in mind even 
when interpreting the one-point results. 

In this picture wc can derive the mass distribution of halos at a redshift z by considering the 
statistics of the smoothed linear density field. If the initial density field is a Gaussian random field 
and the smoothing is done using sharp fe-space filters, then the value of the smoothed 6 undergoes 
a random walk as the cutoff value of k is increased. If the random walk first hits the collapse 
threshold Sc{z) at k, then at a redshift z the point A is assumed to belong to a halo with a mass 
corresponding to this value of k. Instead of using k or the halo mass, we adopt as the independent 
variable the variance at a particular filter scale k, 



dc{z) = SJD{z) . 



(1) 




(2) 



In order to construct the number density of halos in this approach, we need to find the equa- 
tion that describes the evolution of the probability distribution Q{S,Sk), where Q{S,Sk)dS is the 
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probability for a given random walk to be in the interval 6 to 5+ dS at 5^. Alternatively, Q{6, S^) dS 

can also be viewed as the trajectory density, i.e., the fraction of the trajectories that are in the 
interval 5 to 6 + d6 at Sk , assuming that we consider a large ensemble of random walks all of which 
begin with 6 = at Sk = 0. 

We first examine the evolution of Q in the absence of any barrier. We consider a small step ASk 
in Sk, during which 6 changes by A5. We can obtain Q{S, Sk) by integrating over the probability 
Q{S — A5, Sk — ASk) that we started at the point {S — AS, Sk — ASk), multiplied by the probability 
of making the step {AS, ASk) given the starting point {S — AS, Sk — ASk)- The equation for Q is 
thus 

/•oo 

Q{S, Sk) = / dAS GiAS, ASk) Q{S - AS, Sk - ASk) , (3) 

JA(5=-oo 

where the probability G{AS, ASk) of making the step does not depend on the starting point, and 
is given by the Gaussian 



G{AS,ASk) = 



v'27r ASk 



exp 



(ASl 
2 ASk 



(4) 



To solve the integral equation for Q we expand the term Q{S — AS, Sk — ASk) in eq. (3) with respect 
to S and obtain 



Q{S, Sk) = Q{S, Sk - ASk) - {AS) + --^^^ 



{Asy 



(5) 



where on the right-hand side all the Q's are evaluated at {S, Sk — ASk)- The expectation values on 
the right-hand side refer to the probability distribution G{AS, ASk), and they are simply {AS) = 
and ((A(5)^) = ASk- The term Q{S, Sk - ASk) can then be expanded as Q{S, Sk) - ASk^, while 
we can substitute Sk for Sk — ASk in the other terms on the right-hand side, as this difference 
would correspond to higher-order terms and can be neglected. With these substitutions we obtain 
a diffusion equation, 

dQ 1 d'^Q 



dSk 2 5(52 ' 

which is satisfied by the usual solution which we label Qq-- 



Qo{s,Sk) 



: exp 



^2 1 



2Sk 



G{S, Sk 



(6) 



(7) 



To determine the probability of halo collapse at a redshift z, we consider the same situation but 
with an absorbing barrier at S = v, where we set v = Sc{z)- The fraction of trajectories absorbed 
by the barrier up to Sk corresponds to the total fraction of mass in halos with masses higher than 
the value associated with Sk- In this case, the equation satisfied by Q is exactly the same as in the 
above derivation, because the chance of being absorbed by the barrier over the interval ASk goes 
to zero exponentially as ASk 0, and the barrier has no effect to first order. Thus the solution 
with the barrier in place is given by adding an extra image-solution: 



Q{i^, S, Sk) = Qo{S, Sk) - Qo{2i^ - S, Sk)- 



(8) 
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Using this expression, we can calculate the fraction of all trajectories that have passed above the 
barrier v by to be 



(9) 



F{v, Sk) = l- (16 Q{u, 5, Sk) = 2 I d6 Qo{S, Sk) . 

J —oo 

The differential mass function is then determined by 



85 



5=u 



2^S 



3/2 



exp 



(10) 



where we have used the fact that Qq satisfies eq. (6). As /(z^, Sk) dSk is the probability that point 
A is in a halo with mass in the range corresponding to Sk to Sk + dSk, the halo abundance is then 
simply 

dSk 



dn p 
dM ~M 



dM 



f{^,Sk) 



(11) 



where dn is the comoving number density of halos with masses in the range M to M + dM. The 
cumulative mass fraction in halos above mass M is similarly determined to be 



F{> M\z) = erfc 



(12) 



While these expressions were derived in reference to density perturbations smoothed by a sharp 
/c-space filter as given in eq. (2), Sk is often replaced in the final results with the variance of the 
mass M enclosed in a spatial sphere of comoving radius r: 



a2(M) = a'^(r) 



1 r 

2^ Jo 



k'^dkP{k)W^{kr) , 



where W{x) is the spherical top-hat window function, defined in Fourier space as 

sin(a;) cos(x) 



W{x) = 3 



x^ 



(13) 



(14) 



With this replacement we recover the cumulative mass fraction that was originally derived in Press 
&; Schechter (1974) simply by considering the distribution function of overdensities at a single 
point, smoothed with a top-hat window function, and integrating from Sc to infinity. In this 
derivation the authors were forced to multiply their result by an arbitrary factor of two, to account 
for cases in which collapsed peaks were contained within a collapsed peak at a larger scale. The 
excursion-set derivation presented here, based on Bond et al. (1991), properly accounts for such 
peaks-within-peaks, however, as well as makes explicit the approximations involved in working with 
(T^(r). Strictly speaking, dealing with a real-space filter requires a complete recalculation of /{u, a^) 
which accounts for the correlations intrinsic to W{x). However, simply replacing Sk with a^{r) 
in eq. (11) has been shown to be in reasonable agreement with numerical simulations (e.g., Katz, 
Quinn, & Gelb 1993), and is thus a standard approximation. 
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While this standard Press-Schechter mass function, in which 5c = 1.686 and Sk = cr^(r), is a 
useful statistical tool, it is possible to improve its agreement with simulations by adopting more 
complicated choices of these parameters, and adjusting the functional form of eq. (10). Finding the 
ideal model for Sc, Sk and f{u,Sk) has become somewhat of an art, with many authors proposing 
various approaches. Models have been studied in which the mass function is modified by fitting 
to simulations (Sheth, Mo, & Tormen 2001; Jenkins et al. 2001), incorporating the Zel'dovich 
approximation (Monaco 1995, 1997a,b; Lee k. Shandarin 1998, 1999) or even extending the adhesion 
approximation (Menci 2001), itself an extension to the Zel'dovich approximation. While these 
methods improve the accuracy of the Press-Schechter technique, they are concerned with single- 
point quantities and thus probe an altogether different direction than the one explored here. In this 
investigation, then, we focus on generalizing the basic excursion-set analysis to the collapse of two 
neighboring points; we develop a two-point formalism to which improvements to the single-point 
mass function are likely to be directly applicable. 



3. Two-Point Halo Collapse 

3.1. Analytic Preliminaries 

Having reviewed the standard derivation of the one-point halo mass function, we turn to the 
evolution of two points, separated by a fixed comoving distance d. Note that this definition of 
distance is in Lagrangian space, which is intrinsic to any Press-Schechter type approach. Thus it 
is the comoving distance between points A and B at early times, and does not take into account 
subsequent motions of these points. In the two-point case, one quantity that enters is the cross- 
correlation between two objects identified by sharp fc-space filters: 

Ud. Sk) = ^ fj''^ k'^ dk' P{k') , (15) 

where the integration limit k is set by Sk through eq. (2). It is also convenient to define 

^^^'^^^= k{Sk)d ' 

so that ^ 

Ud,Sk)= [ ' v{d,S',)dS', . (17) 
Js'=o 

If we consider a filter ki at point A and /c2 at point B, the cross-correlation involves only those 
/c-values common to both filters, and the result is simply given by eq. (15) where the upper in- 
tegration limit is A; = mm[ki,k2]- Curves of r]{d,Sk) and ^k{d,Sk)/Sk for various values of d are 
shown in the upper two panels of Figure 1. Note that at small values of Sk, when d ^ k~^{Sk), 
rf{d, Sk) approaches 1, and thus ^k tends toward Sk- At high values of Sk, when d S> k''^{Sk), the 
perturbations become uncorrelated, and thus ^k ^ Sk- 
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In this figure and throughout this paper, we illustrate our results for a cosmological model 
corresponding to a Cold Dark Matter (CDM) cosmogony with a non-zero cosmological constant, 
a choice that is based mainly on the latest measurements of the Cosmic Microwave Background 
(e.g. Balbi et al. 2000; Netterfield et al. 2001; Pryke et al. 2001). We fix = 0.3, Qa = 0.7, 
Cli, = 0.05, (Tg = 0.8, h = 0.65, and n = 1, where and $7^ are the total matter, vacuum, 

and baryonic densities in units of the critical density, as = a{8h~^Mpc) as in eq. (13), h is the 
Hubble constant in units of 100 km s""^ Mpc""*^, and n is the tilt of the primordial power spectrum, 
where n = I corresponds to a scale invariant spectrum. Our results apply quite generally to any 
cosmology, however, and thus this model is only an illustration of our approach. 

Just as we needed the real-space variance in the one-point case, many of the relevant two-point 
quantities discussed below depend on the correlation between two spatial filters centered about two 
points at a separation d. In this case, the standard expression is 

^r{d,n,r2) k'dk^-^^^P{k)W{kn)W{kT2) , (18) 

where ri and r2 are the radii of the two filters, and W{x) is again the top-hat window function 
given by eq. (14). Unfortunately, simply substituting this quantity for in analogy with the usual 
one-point ansatz of substituting for Sk does not yield an acceptable approximation. This is 
because in the simple limit d — >^ 0, the correct one-point results [see eq. (43) below] are recovered 
only if ^ — min [cr^ (ri ), (T^(r2)]. In this work, then, we instead make use of 

^r^a:.(^^>n,r2) ^ Cr[(i,max(ri,r2),max(ri,r2)] , (19) 

which is equal to when the two filters have equal radii. 

Wc have also explored an alternative approach, in which we replace with ^kid,Sk) as in 
eq. (15), but now choosing the integration limit k to correspond to real-space filtering, so that 
Sk{k) = (r). In this case we define 

^k{r){d,ri,r2) = ^jfc{d,min[cr^(ri),o-^(r2)]}, (20) 

where k is not proportional to 1/r but instead these quantities are only indirectly related, with r 
determining which in turn determines k. In this approach, the derivatives of ^j.^^) with respect 
to ri and r2 can be expressed analytically in terms of rj, reducing the overall computational load. 

These alternative definitions of the correlation function are illustrated in Figure 1 in which we 
plot ^r{d, a^) I a"^ = ^r[d-,r{a'^)^r{a'^)\la'^ and ^k{d,Sk)/Sk as functions of and Sk respectively. 
This parameterization allows for a direct comparison between and $,r, and shows that while 
these two correlation functions are rather similar, £,r/o''^ is smooth while 6.k/Sk shows oscillations 
at intermediate scales. While these oscillations are small, we have found that they become more 
pronounced in some of the two-point quantities discussed below. Thus we choose in this paper 
to base our ansatz on the smoother quantity Cr-^axi despite the fact that its derivative must be 
calculated numerically. 
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Fig. 1. — Various correlation functions. Upper panel: each line shows rj{d,Sk) as a function of Sk 
for various values of d. From left to right d = 10, 3.3, 1, 0.33, and O.l/i^^ Mpc respectively. Center 
panel: Ckid, Sk)/ Sk as a function of 5^ for the same Lagrangian distances as in the upper panel, 
again arranged from left to right. Bottom panel: ^r{d,cr'^)/o''^ as a function of cx^, again for the 
same distances arranged from left to right. 
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in Fig. 2 we examine how these quantities vary as a function of scale. In the upper panel of 
this figure, wc plot Sk and cr^ as functions of and r respectively. Here we see that while both 
functions decrease monotonically with increasing length scale, cr^ is somewhat greater than S at 
all spatial values, such that roughly Sk{l/r) « (T^(2r). Thus it is more appropriate to compare 
r to 2/k than 1/k. In the lower two panels of this figure, we plot and ^ functions of 
Lagrangian distance for various values of Sk and a^, respectively. Here again we find much the 
same behavior as in Figure 1, with Ck/Sk and ^r/c"^ approaching 1 at small d values, and the points 
becoming uncorrelated at large distances. Note also that ^,k/Sk again contains small oscillations at 
intermediate distances. 



3.2. Two Correlated Random Walks 

With these correlation functions in hand, we now consider the simultaneous correlated random 
walks of two overdensities 6i{Sk,i) and (^2 (5/0,2) separated by a fixed Lagrangian distance d. As in 
the one-point case, for the derivation we adopt sharp fc-space filters. We want to determine the 
joint probability distribution of these two densities, Q{Si,62, Sk,i, Sk,2, d). In terms of a trajectory 

density in the (61,62) plane, Q{6i,62, Sk,i, Sk,2, d) d6i d62 is the fraction of trajectories that are in 
the interval 61 to 61 + d6i and 62 to 62 + d62 at {Sk^i, Sk,2)- Below we will take Sk,i and Sk^2 to be 
the final variances of these trajectories, denoting intermediate variances with the primed notation 
S'l^ ^ and 5^ 2- We then consider a large number of random walks all of which begin with 61 = and 
62 = at 4,1 = and 5^, 2 = 0. 

With sharp fc-space filters, the problem simplifies due to the fact that we are working with a 
Gaussian random field. Suppose we consider random walks 5i(5'^ 1) and (52(5'^ 2) ranges 
< S'f^i < Sk,i and < SjJ, 2 < Sk,2- Since different A;-modes are independent of each other, 
any part of the random walk 61, e.g., in the range Sk < S'f, ^ < Sk + dSk, is only correlated with 
the same range, 5^ < 5^ 2 < 'S'fc + dSk, in the other random walk, with a correlation strength 
determined by ri(d,Sk)- This means that in order to determine the joint probability distribution 
Q{6i,62, Sk^i, Sk,2, d), we do not need to vary S*^ ^ and 5^ 2 independently; instead we can consider 
Q to be a function of a single variable S'jJ. (in addition to the variables di, 62, and d). If the final S*^ 
values are unequal, e.g., Sk^i > Sk,2i then we continuously increase S'f, from to 5^,2, generating 
the two correlated random walks and solving for Q on the line S'^ ^ = S'j^ 2 = S'^. m- the {Sk,i-, Sk,2) 
plane, and we then continue to increase S'j^ up to Sk,i, with only the random walk continuing 
further (with ^ = 5^). 

We thus divide the evolution of Q into two segments, one in which the /c-space filter includes 
only small wavenumbers, such that S'^ is smaller than both Sk^i and Sk,2i and a second segment of 
wavenumbers such that S')^ is between Sk^i and Sk,2- With d fixed, we follow the derivation from 
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r,k-V(h-^Mpc) 




I I I I I 

0.1 1 10 100 

d/(h-iMpc) 

Fig. 2. — Variance and correlation functions as a function of scale. Upper panel: The solid and 
dotted lines show Sk{k~^) and C7^(r) respectively. Center panel: Ck{d, Sk)/Sk as a function of d for 
various values of Sk- From left to right = 10, 3.3, 1.0, 0.33, and 0.1, corresponding to k^^ = 
0.38, 1.3, 3.4, 6.8, and 12 comoving h^^ Mpc respectively. Bottom panel: ^rid,cr'^)/cr'^ as a function 
of d. From left to right o"^ = 10, 3.3, 1.0, 0.33, and 0.1, corresponding to r = 0.70, 2.4, 6.5, 14, and 
26 h-^ Mpc. 
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the one-point case and expand Q{di — ASi,62 — A62, S'j^ — AS'j^, d) to obtain 

^^^as^ = 2 ^^'^^^ + ^^^^^^^^ m^f, + 2 ^^'^ 



Evaluating the expectation values in each regime gives 
dQ 



S'fe,2 < Sk,i (22) 



2-QsT Sk,l <S'^< Sk,2 ■ 



In order to compute the evolution of these points in the regime in which S*^ < Sk^i, Sk,2 it is 
simpler to transform to the uncorrelated variables (5+ = {61 + §2)/-\/2 and 5- = {5i — 62) /V^- In 
this case we obtain 

dQ_ ^ l + v \-ri 

dS'^ 2 ddl^ 2 861 ■ ^ ' 

In these variables the two random walks are independent, and the usual no-barrier solution is 

Qoi6+,S-,Sk,min, ^k) = G[S+,Sk,min + Cfe] G[6-,Sk,mm " Cfc] ) (24) 

where 

Sk,m.m = min(S'fc,i, Sk,2) , (25) 

and the covariancc of 5i and 82 at Sk,min is ^ Ck{d, Sk,min) [see eq. (17)]. The solution at the 
point iSk,i,Sk,2) is then simply obtained by convolving Qo{d+,6-, Sk,rain, ^k) with G{5i, Sk,i - Sk,2) 
or G{62,Sk,2 - Sk,i). 

^From eq. (24) wc sec that in the 5-^- and 5_ coordinates, the evolution of the densities is 
particularly simple, and there are clearly image solutions of the form Qo(w+—S-i-,W-—5-, Sk^j^in, ^k)i 
where and w- are arbitrary constants. But unlike the one-dimensional case, the absorbing 
barriers at Si = vi and 62 = 1^2 are complicated due to the coordinate transformation, where, e.g., 
one of the barriers takes the form S+ + 6- = w, where is a constant. We want a solution for Q 
which is zero on this barrier, but the image solutions do not help since 5+ and S- are coupled in the 
equations that describe the barriers. An exact solution thus requires a numerical approach, in which 
the diffusion equation is solved with the absorbing barrier imposed at each time step. However, in 
the following section we derive an accurate analytic approximation to the exact solution. 



3.3. Two-Step Approximation 

While the full solution of the double barrier problem requires a numerical approach, a closer 
look at the problem leads to a a simple approximate analytic solution that captures the underlying 
physics of two-point collapse. Note that we need only approximate the evolution for < S'j^ < 
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Sfe min, since the evolution for higher values of S'j^ involves only one of the random walks and 
therefore has a simple exact solution. 

We consider the equation for the differential correlation function between the halos as a function 
of mass, eq. (16). While 77 (d, S"^) is an oscillating function, it equals unity at small values of 
and its amplitude steadily declines with S'j. as the corresponding wavenumbcr k enters the regime 
in which /cr S> 1. Thus for small S'^ values, the two random walks are essentially identical, with 
Si{S'p,) ~ 52{S'p.)- In this regime we can simply evolve a single random walk in 5\, and then set 
82 equal to the resulting final value of 5i. At large 5^, on the other hand, •q{d, S'^) — 0, the two 
random walks become independent, and the problem again simplifies. 

These observations lead us to propose a "two-step" approximation in which we replace 77 (d, S'j^) 
with a simple step function that jumps from unity to zero at some value of S'^. We choose this 
jump to occur at a value of S'^. that preserves the exact solution for Q a,t S'^ = 5^,111111 in the absence 
of the barriers. In the no-barrier case, the joint distribution of and 82 at Sk^umi depends only on 
the variance S'fc.min of each of these Gaussian variables, and on their covariance ^fc(d, S'^^min) which 
is given by eq. (17). In order for the integral in eq. (17) to be the same for both the exact function 
•qid^ S'f,) and for the step-function approximation, we must fix the step to occur at S]^ = ^k{d, -Sfc,min)- 
Thus, our two-step approximation is 

,min , 

such that the trajectories are completely correlated when 5^ is less than the cross-correlation 
between the points at Sk,mm, and completely uncorrelated when S'j^ exceeds this value. This ap- 
proximation is compared to the exact form of ri{d, S'jJ for various values of S'j^ in Fig. 3. 

We combine this approximation for the evolution of the two random walks for < Sk,min 
with the single random walk which continues for S'j^ > Sfc^min, resulting in the following overall 
prescription with absorbing barriers at 5i = ui and 5^ = 1^2- First, we evolve 61 for < S'j^. < 
(,k{d, Sk^niin)- Since we are assuming that the two random walks are identical in this regime, we 
must place the barrier on Si at 

t'min = min(zyi, 1/2) . (27) 
Quantitatively, the single absorbing barrier solution, eq. (8), gives Q at S'j^ = ^k{d, Sk^min), 

Qa(z^min,<5l,(52,6) = [G{Si,^k) - G(2l^min - Si,^k)] Sd{Si - (52)0(i^min - Si), (28) 

where 6d is a one-dimensional Dirac delta function, 9 is the Heaviside step function, and here and 
in the rest of this section refers to ^kid, Sk,mm)- We then set S2 = Si and evolve the random walks 
in Si and S2 independently up to Sk,i and Sk,2, with the barriers at vi and ^2, respectively. Thus, 
we first convolve eq. (28) with the no-barrier solutions for the two independent random walks. 



QbiSi,S2,Sk,i,Sk,2,^,k) = G{6i,Sk,i - ik)G{62,Sk,2 - Cfe) > 



(29) 
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Fig. 3. — Comparison between r]{d^S'j^) calculated exactly (solid lines) and in the two-step approx- 
imation (dotted lines), for various values of Sk^min- In all plots d = 3.3 comoving Mpc. 



-15- 



which gives 



Q-(l^min, 2fmin — ^i, 2u,^in — ^2i SkA, Sk 2,Ck) ) (30) 



where 



Q±{l^mm,Sl,S2, Sk,l, Sk,2,^k) = 



^7r^Sk,iSk,2 - ek 



exp 



5lSk^2 + SilSk,l - 2(5i(52Ca: 

'2{Sk,iSk,2 - il) 




and 



S = 



ik{Sk,i - ik){Sk,2 - ik) 

Sk,lSk,2 — Cfc 



v = 



Si 



Sk,l — Cfc Sk,2 — Cfc 



(31) 



(32) 



Finally, we account for the additional barriers with reflections about the vi and 1^2 axes, which 
yields 



Q{l'l, f2, 61, S2, Sk,l, Sk,2, ^k) 



Qo{l'min,Sl,S2, Sk,l, Sk,2,S,k) + 
Qoil^min, 2j^1 — 5l, 2z/2 — 62, Sk^l, Sk^2, Ck) 
Qo{l^min, Sl,2v2 — 62, Sk^l, Sk,2, ^k) — 
Qo(t'min, 2j^i -61,62, Sk,l, Sk,2, ^k) ■ 



(33) 



Following the common approximation taken in the single particle case we again replace S^^i and Sk,2 
with (T^(Mi) and (7^(M2), respectively. Similarly, we also replace ^k with the correlation function 
of real-space peaks ^r^ax ^ given by eq. (19). Using this expression we can now construct the 
combined mass function of halos at two points separated by a comoving Lagrangian distance d. 
Before we derive this function, however, it is important to understand the errors introduced by our 
simple two-step approach. Thus, we compare eq. (33) with exact numerical solutions in order to 
show that our analytic approximation is accurate over a broad range of parameter space. 



3.4. Numerical Approach and Comparison with Two-Step Approximation 

In order to solve eq. (23) in the presence of absorbing barriers at fixed values of 6i and 62 
we have developed a simple finite difference code. We construct a 400 x 400 zone mesh in 6+ 
and 6^ spanning the range from -SJscaic < 6+ < 5 6scaic and -5(5scaic < 5- < 5(5scaic, where (5scaic 
is a typical overdensity of interest. In this case the width of each zone, dx, is 0.025 (^gcaie in both 
dimensions. On this grid, we construct Qjj = Q{6+^i, 6-j, 5*^(0) where i and j are spatial indices in 
each of directions and t is a "time" counter such that S'i^{t) = t dS'f., where we take dS'f, = (5gj,g^g/3000 
to be the interval by which we refine our A;-space filter at each time step. 
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Initially the distribution is taken to be a delta function, such that Qq q = 1 and Q^j = for 

all other values of i and j. We calculate the values at each new step in using an alternating- 
direction implicit method (Press et al. 1992) in which we divide each time step into two stages of 
size dS'j^/2, and solve first in the x and then the y direction. In this case eq. (23) becomes 



Q^i^iii ~ "^QIj ^ + Ql+i,j + ~ [Qi,j-i ~ "^Qid + Qi,j+i] 

Qlf., - 2Ql+' + Ql+l, , (34) 



J ^Vjj -t- Vi+ij 



where = , a_ = ' ^^"^ system can be solved using simple inversions 

of tridiagonal matrices. Finally, at the end of each time step we impose the absorbing barriers at 
5i = vi and 82 = V2 by setting Q*^"^'"" = at all points at which b^^i + > \plv\ or S^^i — 6-j > 
\plv2- With this code we are able to examine the accuracy of our two-step approximation for a 
number of physical cases. 

For any given redshifts z\ and ^2, we fix the barriers at the collapse thresholds v\ = 5c (^1) and 
^2 = Sc{z2), and consider the following quantity: 

d6i dS2Q{iyi,i^2,Si,52,Sk,i,Sk,2,^k) ■ (35) 

-00 J —00 

This is the fraction of trajectories that are not absorbed before the random walks reach the point 
{Sk,i, Sk,2)- Recall that Sk can be related to the mass scale of the collapsed objects by the simple 
ansatz Sk —>■ (T^(r) as described in §3.1 Thus equation eq. (35) can be interpreted as the probability 
of point A being in a halo of mass M < Mi(cr^ = -Sfe^i) and point B in a halo of mass M < 
M2{a^ = Sk,2)- Since F is the basic quantity that we later use to calculate various halo properties, 
a reasonable way to judge the success of our approximate method is to test its ability to closely 
reproduce this function. In the comparisons (Figures 4 and 5) , we use the /c-filter quantities and 

In Figure 4 we compare analytical and numerical values of Flui, 1^2, Sk^i, 8^,2-, Cfe) as a function 
of Sk,i = Sk,2 = -S* for various distances and collapse redshifts. In the left three panels of this 
figure, we consider large objects collapsing at different redshifts by fixing ui = 4.13 = Sc{zi = 2), 
ui = 5.47 = Sc{z2 = 3), (5scaie = 3.0 and varying the distances as labeled. In the right panels we 
consider the simultaneous collapse of somewhat smaller objects at earlier times fixing ui = 1^2 = 
12.35 = 6c{z = 8) and Sgcalc = 8.0. The figure shows that both for peaks that form simultaneously, 
and for peaks that collapse at difi^erent times, the two-step solution tracks the numerical solution 
over a large range of halo masses and distances. Note also that these expressions match each other 
closely even when they are very different from both the correlated and the uncorrelated cases. 

In Figure 5 we fix three values of S^^i = Sk,2 = S and the collapse redshift of a single halo 
21 = 5, and consider F as a function of the collapse redshift Z2 of the second halo, at three different 
separations. The figure shows that our approximation does well at reproducing the numerical 
results even for cases in which the collapse redshifts between the two objects are very different. As 
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Fig. 4. — Comparison between numerical solution and two-step approximation as a function of 
S. In the left panels we consider two peaks that collapse at redshifts 2 and 3, respectively, while 
in the right panels we consider two peaks that collapse at z = 8. The panels are labeled by the 
comoving Lagrangian distance between the peaks, and in each panel the numerical solution and 
the two-step solution are given by the solid and dotted line, respectively, while for comparison, 
the fully uncorrelated solution is given by the short-dashed line and the fully correlated solution 
is given by the long-dashed line. Note that the two-step approximation is almost indistinguishable 
from the numerical solution, in every panel. Note also that in this plot we use Sk for S and for 
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before, the numerical and approximate results are in good agreement over a wide range of values 
for which the fully-correlated and fully-uncorrelated expressions are not accurate. 

Besides the results shown in these figures, we have conducted extensive convergence tests 
varying dS and dx, and have found that reducing any of these parameters has no effect on our 
numerical results. Thus we are confident that the differences between the analytic and exact 
solutions, as presented in Figures 4 and 5, are in general small, and the errors in our analytical 
expressions are at most 2%. 

4. Structure Formation with the Two-Point Formalism 

4.1. Bivariate Mass Function 

Having developed in the previous section an accurate approximation to the joint probability 
distribution at two points, we now apply this distribution to derive the two-point generalization of 
the Press-Schechter mass function of collapsed halos. This generalization is the joint probability of 
having point A lie in a halo in the mass range Mi to Mi + dMi at a redshift zi and point B lie in 
a halo in the mass range M2 to M2 + dM2 at a redshift Z2- Note that the derivation in §3.2 applies 
to any redshifts zi and Z2 and thus our calculation can determine the number density of halos at 
B both before and after the formation of a halo at A. 

Again, it is important to emphasize that this joint mass function is defined in the Lagrangian 
coordinate system that arises naturally in an excursion-set approach. Thus the distance between 
the points in physical space may be somewhat different than the d considered in our equations, 
as the relevant comoving distance in our case is that separating the points at early times. This 
coordinate system has both its weaknesses and its advantages. While it somewhat complicates the 
direct comparison of our expressions with numerical simulations, this can in general be remedied 
with estimates of the final, Eulerian halo coordinates (e.g.. Mo Sz White 1996). Furthermore, for 
many applications, Lagrangian results are in fact preferable to an Eulerian description in which 
halos are indexed by their physical coordinates with no reference to where these perturbations 
came from. In studies of spatially-dependent feedback in structure formation, for example, it is 
often more important to have a measure of the total column depth of material separating two 
perturbations than their precise distance in physical space. 

In this section, we adopt a general notation for the variances and correlation functions, using S 
to represent either the /c-space filtered quantity, S^, its real space equivalent cr^(r), or any alternative 
definition. Similarly, ^ denotes ^rmaxi ^k{r)i or any alternative definition of the correlation function. 
With this notation, we now consider f{i'i,i'2, Si, S2,S.) dSi dS2, the probability of having point A 
in a halo with mass corresponding to the range Si to Si + dSi and point 5 in a halo with mass 
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Fig. 5. — Comparison between numerical solution and two-step approximation as a function of 
Z2- The panels are again labeled by the comoving Lagrangian distance between the peaks, and in 
each panel the numerical solution and two-step solutions are given by the solid and dotted line, 
respectively, and the fully uncorrelated and fully correlated solutions are given by the short-dashed 
and long-dashed line, respectively. All four lines coincide for S = 2. For 5 = 4 and = 6, our 
approximation and the numerical calculation are almost indistinguishable, but the uncorrelated 
and the fully correlated lines lie significantly lower and higher, respectively. Note that in this plot 
we use Sk for S and for 
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corresponding to the range S2 to S2 + dS2- This is simply related to the quantity in eq. (35) as 



/(l^l,I^2,5'i,5'2,0 = dh j d(^2 1^2,^1,^2, 5*1, 5*2,0 , 



(36) 



where ^ is not considered an independent variable (and so the partial derivatives involve variations 
of 0- This simplifies to 



/(Si, 52,2^1,1/2,0 = 



_d d_ 

dSi dS2 



/fi poo ~\ V /"oo 

d5i - d6i 2 d52- d62 
-00 J —00 J L J —00 J —00 



Qo{l^l,l^2,Sl,d2,Sl,S2,0 ■ 

(37) 



In order to perform these integrals, we must consider Qq from eq. (30) in its unintegrated form, 
written as a convolution with an integration variable x: 



dx [G{x, - G{2u^in - X, 0] GiSi -x,Si- 0(62 - x, ^2 - • 

-00 



(38) 



Since 5i and Si appear only in a single term in the x-integration, and similarly for ^2 and 62, Qo 

ScttjisflGS 



+ 



and 



+ 



(39) 



dSi 2 dSj dSi ' dS2 2 dS^ 882 ' 

where we consider ^ to be an independent variable when calculating 

We perform the partial derivatives with respect to 5*1 and S2 on the integrand in equation 
(38), obtaining four terms. Terms that contain partial derivatives with respect to 5i or 82 allow us 
to perform the 5\ or ^2 integrations in eq. (37) trivially, while the evaluation of terms containing 

is more involved. 

To compute , we note that the integrand in eq. (38) is a product of three terms, which we 



denote, respectively from left to right, as /i, I2, and I^. These satisfy the equations 

and 



dh Id'^h , a/2,3 _ 1 9^/2,3 

2 ' 



(40) 



2 0x2 ' 

which allow us to perform the integration with respect to x, using integration by parts to eliminate 
all double derivatives. This yields an integrated term, whose contribution to f{i>i,i'2, Si, 82,$,) is 
zero, and the remaining term in is 



J — ( 



dx 1 1 dxh dxh ■ 



Finally we use dxl2 = —dsih and dxh = — c^52-^3 further simplify any term that contains a 
partial derivative with respect to ^, yielding 



f {1^1,1^2,81,82,0 = 



861862 



+ 2 



d8i 861 982 861 



8^ 8^ dQo d^i 
881 882 8^ 8818S2 



Qo 



61=1/1, 52=^2 

(41) 
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Note that this expression is discontinuous for some choices of S and ^. This is true both for 
the fc-spacc-filtcring case for which cqs. (31) and (33) were derived, as well as the real-space case 
in which ^ = Crmax S = a^, because in each case ^ is a function of S'min whose derivatives with 
respect to Si and 5*2 are not continuous. While this may seem at first to be a serious problem, a 
closer look at these discontinuities shows that they are in fact quite harmless. 

To see why this is true, consider the case in which 5*1 and ^2 are nearly equal. If Si = S2 — e, 
then only the and terms of eq. (41) are nonzero (since in this case ^ is a function of 

-S'min = Si), while if = 52 + e then the only nonzero terms are the mixed derivative term and 
Now, in the limited case in which vi = 1^2, and are equal at = S2 and / is 

continuous at this point, while H I'l ^ 1^2, f is discontinuous at Si = S2- 

This discontinuity at vi 7^ 1^2 is not a limitation of our method, however, but is instead a 
true reflection of the physical situation. This is clearest in the d = case in which the two points 
coincide. In this case, if ui < 1^2 and thus zi < Z2, taking Si = S2 — e corresponds to Mi > M2 
which is simply the accretion of mass over time. Taking Si = S2 + e, however, would correspond to 
losing mass with time, which is contradictory to our most basic assumptions. Thus, approaching 
Si = S2 from different directions has two different meanings, only one of which is relevant to 
structure formation. The problem therefore arises from defining an effective arrow of time and 
considering only the first-crossing distribution with respect to it. Thus, the discontinuity only 
occurs in transitions between physically relevant choices of parameters and regions of parameter 
space that contradict the whole premise of our approach, and it need not concern us. 

In Appendix A we give all the expressions necessary to evaluate eq. (41) explicitly. We can 
thus analytically compute the joint halo abundance as 



'^'"'^ - P - — fiu,,^,,Si,S2,0 , (42) 



dMidM2 Ml 



dSi 



dMi 



P 

M2 



dS2 



which reduces to the product of and as given by eq. (11) in the limit of large distances, 
when ^ ^ 0. Note that the last two terms in eq. (41) are identically zero if we use or &max ^ 
our definition of ^. 

In Figure 6 we plot the normalized mass function, setting = (T^(Mi), 52 = a^{M2), ^ = 
^Tmaxid, Ml, M2), and fixing z^i = 1^2 = ^c{z) for various redshifts, in the same ACDM cosmology 
considered in §3. Each function is normalized by its uncorrclatcd value in order to emphasize the 

features of the joint distribution, i.e., each panel shows ^^^'^m^ 
for various values of d, Mi, and z. 



as a function of M2 , 



drii dn2 
dMi ^ dM2 



In the left panels of this figure we fix Mi to be 10^^h~^ Mq, corresponding to a typical L^, 
galaxy, and construct the normalized number density at redshifts 1 and 2, and Lagrangian distances 
of d = 1, 3.3, and 10 comoving h~^Mpc. At the larger distances, the presence of a perturbation 
at point A enhances the formation of an object at point B, unless M2 is so large that such a halo 
at point B would very likely absorb point A into it as well. Thus the d = 3.3 and 10 comoving 
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h Mpc curves are enhanced at all but the largest M2 values. This enhancement is more significant 
at z = 2 as these peaks are rarer, and hence more highly biased. 

As the points get closer, however {d = lh~^ Mpc), the lines become strongly peaked at M2 = 
Ml, excluding the formation of objects of vastly different sizes at a short distance. Indeed, as d ^ 0, 
f approaches the single point differential mass function times a delta function, as both points must 
belong to the same peak in the limit when the two points become identical. Real applications 
of these results at short distances must consider more explicitly the issue of halo exclusion, i.e., 
the fact that a given halo at point A contains all the mass from some region, and this halo either 
contains the point B or not. More generally, the two regimes M2 < Mi and M2 > Mi should be 
considered separately because of their different physical interpretations (especially when d is small); 
in the first case, M2 is a small halo that may be accreted by Mi, and in the second case, M2 is a 
large halo that may be about to absorb Mi into it. 

In the right panels, we consider the case of a dwarf galaxy with Mi = 10^ h^^ Mq, forming at 
redshifts z = 4 and z = 6, such that i^(2;)/a"(Mi) and v{z) /a{M2) cover a similar range of values 
as in the L*-galaxy case. As the virial radii of these systems are smaller by a factor of 10, in these 
panels we fix d = 0.1, 0.33, and 1.0 comoving Mpc. These plots display essentially the same 
features as in the more massive case, although the effects of the correlations are somewhat larger 
since the objects are slightly rarer. 

In Figure 7 we again plot the normalized number density of two points forming at the same 
redshift, but now holding M2 fixed and varying the distance between the two perturbations. Here 
we see that as the points come closer to each other, the number densities are initially enhanced, 
as the collapse of an overdensity at the first point makes it likely that a large-scale overdensity 
enhances halo formation at all nearby points. As the perturbations are drawn closer together, the 
curves drop sharply if Mi 7^ M2 as it is impossible for two perturbations of different sizes to form at 
the same position and redshift. If Mi = M2 however, the probability continues to rise dramatically 
at small distances, as / approaches the single point differential mass function multiplied by a 
delta function. Again these effects are stronger at higher redshifts, as rare peaks are more highly 
correlated. 

Our formalism is not restricted to the collapse of two points at the same redshift, but is also well 
suited to study the formation of halos at varying times. In Figures 8 and 9 we again examine the 
formation at 2; = 1 of an Mi = lO^^/i^^ Mq halo associated with an galaxy, and the formation 
at zi = 4 of an Mi = 10^^~^ Mq halo associated with a dwarf galaxy. We now consider M2 to be a 
smaller halo, formed at a distance d and at an earlier redshift. This is essentially a generalization 
of the usual progenitor problem, where now the lower mass halo need not be absorbed into the 
larger halo, but can be located an arbitrary distance away. 

In Figure 8 we consider the normalized number density as a function of M2 for a variety of 
distances, fixing zi = I and 2:2 = 2 and 4 in the Mi = lO^^h^^ Mq case shown in the left panels, 
and fixing zi = A and Z2 = Q and 10 in the Mi = W^h^^ Mq case shown in the right panels. Many 
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Fig. 6. — Normalized number density of correlated halos as calculated from eq. (42) as a function 
of M2, for various values of distance, Mi, and redshift. In the left panels Mi is held fixed at a value 
of 10^^/i~^ Mq, roughly corresponding to an L^. galaxy, at redshifts of 1 and 2. Here the solid lines 
show, from top to bottom (in terms of the highest point in each curve), results with the Lagrangian 
distances between the two points fixed at 1.0, 3.3, and 10 comoving h^^ Mpc respectively, while for 
reference the uncorrelated case is plotted as the dotted line at 1 and the mass at which Mi = M2 
is given by the vertical dashed line. In the right panels, we fix Mi at 10^h~^ Mq corresponding to 
a dwarf galaxy. Here we consider a higher range of redshift values (zi = 22 = 4 and zi = Z2 = 6) 
roughly corresponding to the same values of v/a used in the Mi = lO^'^/i"^ Mq case. In these 
panels the solid lines show normalized densities at d = 0.1, 0.33, and 1.0 comoving Mpc, from 
top to bottom. 
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Fig. 7. — Normalized number density of correlated halos as calculated from eq. (42) as a function 
of d, for various values of Mi, M2, and redshift. Again in the left panels Mi is held fixed at a value 
of W^^h~^ Mq at redshifts of 1 and 2, while the solid lines show, from top to bottom (in terms 
of the highest point in each curve), densities with M2 fixed at W^^h~^ Mq, 3.3 x 10^^ Mq and 
W^^h~^ Mq, respectively, and again the uncorrelated case is plotted as a dotted line at 1. Similarly, 
in the right panels, we fix Mi to be l(fh^^ Mq at redshifts 4 and 6. In these panels the solid lines 
show normalized densities at M2 = lO^/i'^M©, 3.3 x IO^/i^^Mq, and l{fh~^ Mq, from top to 
bottom. 
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features in this plot parallel the zi = Z2 cases: at the larger distances, the curves are enhanced at 
most M2 values, as a second galaxy is likely to form near the first; and if the objects are too close 
together and have masses that are very different then they can suppress each other's likelihood of 
formation. 

Unlike the simultaneously collapsing cases, when Mi = AI2, f does not approach a delta 
function as d ^ 0. Instead, the formation of objects of equal mass at both points is completely 
excluded, as this would correspond to the same halo forming in the same place at two different 
times (while the formalism assumes continuous mass accretion for every halo). Each curve is instead 
peaked at an M2 value that is smaller than Mi , as this value roughly corresponds to the most likely 
progenitor at a redshift Z2 for the larger halo, which forms at zi. These curves can be compared to 
the number densities expected for a progenitor at the same point (i.e., at d = 0), first derived by 
Bond et al. (1991), 



1^2 - V l 



■ cxp 



{U2 - Vif 



-,3/2 



exp 



25i 



(43) 



which is shown by the dashed lines. Note that as d — 0, / approaches this expression exactly. 

In Figure 9 we study the generalized progenitor problem as a function of distance, again fixing 
z\ = 1 and Z2 = 2 and 4 when Mi = 10^^ Mq, and fixing z\ = A and Z2 = 6 and 10 when 

Ml = 10^ Mq. In the upper panels, in which the differences in rcdshifts arc relatively small, the 
normalized bivariate number density is in general more strongly enhanced, the larger the progenitor 
object and the closer it lies to the lower redshift halo Mi. The only limit in which this enhancement 
does not increase as the objects get closer, in fact, is the limit in which M2 — > Mi, which must be 
excluded as the same object cannot form at the same point twice. 

As the difference between zi and Z2 becomes greater, however, the behavior of the normalized 
number density becomes more complex, as is illustrated in the lower panels. We find in this case 



that at all except the largest distances. 



dMidM2 



drii y dn2 
dMi ^ dM2 



no longer increases monotonically 



as a function of M2. This is because even when d is large enough that it is physically possible to 

form objects with masses M2 = Mi at both points, the presence of a large object at high redshift 
at point B makes it likely that point A will instead be absorbed into an even larger halo that forms 
at point B at some intermediate redshift zi < z < Z2- Thus, the case Mi = M2 is suppressed for 
almost all d values, while the densities are actually enhanced in the presence of a more modest 
high-redshift perturbation. Note that again in all cases / approaches eq. (43) exactly at small 
distances, reproducing the usual progenitor distribution at a single point. 



4.2. Bivariate Cumulative Mass Fraction and Nonlinear Bias 



Returning to eq. (35), we can construct the fraction of trajectories that have been absorbed 
by both barriers. Writing Q in terms of the x integral and performing the appropriate Si and 62 
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Fig. 8. — Normalized number density of halos as a function of M2 for two objects forming at 
different redshifts. In the left panels, we fix Mi = 10^^/i~^ Mq and zi = 1, and consider the second 
object as a generalized progenitor, forming at a higher redshift at a point nearby. In these panels, 
each of the solid lines correspond, from top to bottom (in terms of the highest point in each curve), 
to d values 1, 3.3, and 10 comoving /i^^ Mpc, and we consider cases in which Z2 = 2 and Z2 = 4. 
In the right panels, we fix Mi = 10^ /i^^ Mq and zi = 4, while the solid lines now correspond, from 
top to bottom, to d values of 0.1, 0.33, and 1.0 comoving Mpc, with 22 = 6 and Z2 = 10. Also 
shown for reference in all panels are the dotted lines at 1, corresponding to the uncorrelated case, 
and the fully-correlated dashed lines, corresponding to the formation of both objects at the same 
point, as given by eq. (43). 
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Fig. 9. — Normalized number density of halos as a function of d for two objects forming at different 
redshifts. In the left panels we fix Mi = lO^^/i"^ Mq and zi = 1, and consider M2 values of IQH, 
3.3 x 10^1, and W^'^h~^ Mq, at redshifts Z2 = 2 and Z2 = 4. In the right panels, we consider a 
smaller, higher redshift object and fix Mi = W^h~^ Mq and Z4 = 4. In this case we take M2 values 
of 10*^, 3.3 x 10*^, and lO^h'^ Mq, at redshifts Z2 = 6 and Z2 = 10. Each solid line is labeled by 
its corresponding M2 value, while the fully uncorrelated and fully-correlated cases are shown in all 
cases by the dotted and dashed lines respectively (except that for Mi = M2, the fully-correlated 
limit is a normalized density of zero) . 



-28- 

integrals this becomes 

F(.u^,S„S.,0 = £'"<fa |G(x,0 - G(2.„. «f [-^^) erf (;^^) • 

(44) 

This is the product of the mass fraction in halos of mass below 1/1(5*1) at rcdshift zi{vi) and 
the mass fraction in halos of mass below M2{S2) at Z2{i'2)-, given that the distance between the 
halos is d. In other words, if we divide F by the one-point value of the mass fraction in halos of 
mass < Mi(5i) at 21(1^1), we obtain the biased mass fraction in halos of mass < 1/2(^2) at -22(2^2), 
where the biasing refers to an average only over points at a distance d from the population of halos 
< Mi(5i). In order to construct the correlation function of rare, massive halos, we instead need 
a closely related quantity; this is the "bivariate cumulative mass fraction," that is the product of 
mass fractions in halos with masses above Mi and M2, given by 

F{ui, 1/2, < ^1, < -S2, = 1 + 1^2, Si, S2, - erf " erf (^) • (45) 

In Figure 10 we plot F(z/i,t'2,< Si,< 5*2, 0> setting 5*1 = cr^(Mi), 5*2 = (t^(M2), and $, = 
^Tmaxid), and fixing 1/1 = 1/2 = 5c{z) at various redshifts. In the left panels of this figure we again 
set Ml = W^'^h^^ Mq, z = 1 and 2, and d = I, 3.3, and 10 comoving hr^Mpc. At the smallest 
distances, the two points are almost completely correlated, and F approaches its one-point value of 
erfc[z/(z)/-^/2a%Mmax)] as given by eq. (12), where Mmax is the greater of the two masses. This is 
because the plotted quantity expresses the joint probability of a halo of mass > Mi at point A and a 
halo > M2 at point B. When the two points are very close together, both points are contained within 
the same halo, which therefore must have a mass greater than Mmax- Thus if, e.g., d = lh~^ Mpc 
and M2 < Ml, F is roughly constant at a value corresponding to Mi, while F quickly approaches 
erfc[z/(z) /a/2cj2(M2)] when d = 1 and M2 > Mi. Note also that at d = 3.3h^^ Mpc the points are 
somewhat uncorrelated at small values of M2, but at larger value of M2 the overlap between the 
points becomes much larger, and thus they become almost completely correlated, moving towards 
the d = Ih"^ Mpc case at the highest M2 values. Finally, at d = lOh"^ Mpc, the points are almost 
uncorrelated and F closely approximates the product of two independent one-point probabilities, 
as given by the dotted lines. At this large distance, correlations are only somewhat significant at 
the highest M2 values, moving F slightly away from the dotted lines. 

In the right panels, we again consider Mi = lO^h-^ Mq, z = i and z = 6, and d = 0.1, 0.33, 
and 1.0 comoving h-^Mpc. The main features are similar to those seen in the higher-mass cases, 
although the correlations between the two points are somewhat stronger, moving the d = 1.0 and 
0.33 curves away from the dotted, uncorrelated cases, and up towards the almost fully correlated 
d = 0.1 line. Again this line is roughly constant at a value corresponding to Mi for low values of 
M2, and moves towards erfc [(1/(2;)/ 1^2(7^ (M2)] at larger M2 values, and again the correlations for 
the d = 0.33 and 1.0 h'^ Mpc cases become more significant at large values of M2 . 

Returning to eq. (45), we can also immediately obtain the biasing of rare halos; namely, the 
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Fig. 10. — Bivariate cumulative mass fraction as a function of M2, for various values of d, Mi, and 
z. In the left panels Mi = Mq, and z = 1 and 2. From top to bottom, the solid lines 

in these panels show F values with the Lagrangian distances between the two points fixed at 1.0, 
3.3, and 10 comoving Mpc respectively, and the uncorrelated case is shown for reference by the 
dotted lines. In the right panels, Mi = 10^ Mq, and z = 4 and 6. In these panels the solid 
lines show, from top to bottom, F values with d = 0.1, 0.33, and 1.0 comoving Mpc, while the 
dotted lines again correspond to the uncorrelated case. 
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bias between halos of masses larger than M{S) at a distance d at a redshift z{i'), defined as the 

increase in the cumulative mass fraction at the second point given the presence of an object at the 
first point. This is simply given by the bivariate cumulative mass fraction divided by the cumulative 
mass fraction at two uncorrelated points, 



Ci^,2stp(d,-S) + 1 



crfc ^ 






V2Sj\ 



F{,y,i.,<S,<S,ad,S)) . 



(46) 



This equation can be compared with the usual expression used for nonlinear bias (Kaiser 1984), 
which we can rewrite in a form similar to eq. (44) as 



^u,K84{d, S) + l = 







erfc ^ 




V2Sj. 



f 

J — c 



dx G{x, ^) 



erfc 



(u — X 



(47) 



This expression was derived by simply integrating over the distribution of probabilities at each of 
the two points, in a manner analogous to Press & Schechter's original derivation of the one-point 
mass function. Thus, this expression suffers from the same peaks-within-peaks problem which 
forced Press &; Schechter to multiply their expressions by an arbitrary factor of 2. 

The bias between peaks as computed using the two-step approximation, eq. (46) , and the one 
from the standard approach, eq. (47), are plotted in Figure 11 as a function of the height of the 
peak, in units of v^, for two perturbations of various correlation strengths £,/S. This comparison 
helps clarify the range over which eq. (47) is most accurate, but it also makes clear that eq. (46) is 
a superior generalization which self-consistently accounts for all of the matter in the universe. 

In the fully-correlated limit (^ S), ^u,K84 + 1 is too high by a factor of 2, at all v. This 
is because in this case, the points are fully correlated so the joint fraction is Fi^2 = Fi, where 
Fi = erfc(z>/\/2S') is the cumulative mass fraction of a single halo, and therefore the bias is given 
simply by Fi^2/-^i = While the two-step derivation leads to the correct bias in this limit. 

Kaiser's derivation ignores the Press-Schechter factor of 2 caused by the problem of peaks within 
peaks, and thus (,u.KHd + 1 approaches 2/Fi. Note also that in the two-step case, for all values of 

at I' /a the bias approaches 1. This is because every point is in a halo corresponding to 
some u/a > 0, and thus imposing i//cj > is no constraint at all and does not result in bias. In the 
Kaiser (1984) case, however, this mass conservation is not imposed and instead £,u,K84: ^/S as 
u/a — 0. Finally, at large values of u, when v a/(l — $,/S), the two expressions become equal. 
Note, however, that for sufficiently high values of ^/S, ^u,K84 suffers from the erroneous factor of 2 
problem even in the case of extremely rare peaks. 

Finally, we consider another definition of bias which is also commonly used. This definition is 
designed to yield more directly the (Lagrangian) correlation function among rare peaks at a given 
mass M, rather than the bias among the cumulative halo population above some mass M. In our 
calculation, this correlation function is simply given by 



C.,2stp(d, S) + l = [f{u, S)]-^ f{u, u, S, S, ad, S)) , 



(48) 
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12 3 4 

Fig. 11. — Nonlinear bias of rare peaks, as a function of v l\fS. The cumulative bias is shown in the 
top panel; in each pair of curves the dotted line corresponds to the standard expressions derived 
in Kaiser (1984), while the (lower) solid line correspond to ^!/,2stp- Each pair of lines is labeled in 
terms of the scaled correlation between the two points, which is 0.9, 0.6, and 0.3, from top to 
bottom. The bias of halos at a given mass is shown in the bottom panel; in each pair of curves the 
dotted line shows the biased correlation function that is based on Mo & White (1996), while the 
solid line corresponds to ^jy,2stp- In this case ^jS values of 0.6, 0.3, and 0.1 are used, from top to 
bottom. 
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where we refer to eqs. (10) and (41). In the absence of any previous derivation of this correlation 

function, even in the simple case corresponding to the assumptions of Kaiser (1984), previous 
efforts have resorted to defining the bias in other ways and hoping that these alternative definitions 
yield a value which is close to the desired correlation function. These methods include the "peak- 
background split" , which gives (Cole & Kaiser 1989) 

l.,pk-bgd(ci, 5) = I (^^^^1^) ' • (49) 

In testing these alternative definitions we assume that the ratio between the halo correlation func- 
tion {^u,pk-hgd in this case) and the normalized mass correlation function {(,/S) equals the square 
of the bias. Eq. (49) is derived only in the limit of rare halos and large d, so we consider its 
generalization (Mo &; White 1996), 



2 

dx 



In this expression, 5c = 1.686 (see §2) and Sq is the of the mass scale Mq, where a sphere (at 
the mean density) of comoving radius d contains a mass Mq. Note that Cv.mwqg was calculated 
from eq. (43), using a definition of bias which considers the number density of halos of a given 
mass that form out of matter initially contained within larger overdense spheres. The comparison 
in Figure 11 (bottom panel) shows that ^jy,MW96 successfully approximates .^;y,2stp only when S^/S is 
small, and only over a limited range of values of z^/o". On the other hand, our result represents a 
direct analysis of the correlation function of halos, and at all parameter values it is fully consistent 
with the closely related bias ^,y,2stp of eq. (46). 

The abundance of halos and their correlations are determined by the linear power spectrum, 
since the formation of a halo is a long-term process that is driven by the existence of initial 
overdensities. However, correlation functions involving parts of halos depend also on the non- 
linear, internal structure of the halos themselves. Examples include the autocorrelation function 
of galaxies, which depends on the variation with halo mass of the number of galaxies per halo, 
and the dark matter autocorrelation function, which depends on the internal density profiles of 
halos. Prescriptions for the internal structure of halos have been previously combined with the 
halo correlation function based on Mo &; White (1996) to produce various autocorrelation functions 
(e.g., Seljak 2000; Ma &; Pry 2000). These analyses can now be improved with our generalized, 
self-consistent results for halo correlations. 
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4.3. Bivariate Mixed-mass Function 



The last quantity of interest is the bivariate mixed mass function, which is the mass function 
at one point times the correlated cumulative mass fraction at a second point, 



dniF2 



P 

Ml 



dM^ 



^{1^1,1^2,81,82,0 



(51) 



Using eqs. (37) and (38), this can be written as the sum of three terms: 



81-C 



dx [G{x, - G(2i^min - X, 0] G{ui -x,8i- erf ( 2^ 1 ' (52) 

J —00 



1^2(52-0 



where Qq is evaluated at Si = vi and 82 = i'2- 



In Figure 12 we plot this quantity, again normalizing by its limiting value at large distances, 



erfc 



1^1 



3/2 



exp 



25i 



(53) 



in order to emphasize its overall features. The plotted quantity equals the one-point mass function 
/(z^i, 5*1) of Ml halos times the total mass fraction, at a distance d away, in halos above mass M2. 
In this plot we restrict our attention to halos that are able to cool in the neighborhood of the halo 
of mass Mi; for illustration we assume that molecular hydrogen formation is inefficient, and that 
star formation occurs only in galaxies in which atomic cooling is efficient. This in turn requires a 
halo virial temperature of at least 10^ K, which sets a minimum value of M2 at each redshift. 

The figure shows that if Mi is much smaller than M2, then at small distances, the mass fraction 
above the cooling limit is zero, as no objects larger than Mi form at this point. At intermediate 
values of the distance, however, the presence of a small object is able to either lower or raise the 
cooled mass fraction depending on how common or rare the objects are. 

If M2 is a generalized progenitor of Mi, however, so that Mi > M2 and Z2> z\, then we can 
compare with the mixed mass function for two fully correlated points. 



erfc 



1^2 - 1^1 



^2(^2 - Si) 



1^1 



27r5i 



3/2 



exp 



2Si 



(54) 



which corresponds to the dashed lines in the figure. In the case in which the two objects form at the 
same redshift, this is simply the mass function at a single point, as at short distances the presence 
of an object of a size larger than the cooling mass indicates that all the gas at this point has cooled. 
In general, the presence of a halo with Mi > M2 can produce either a positive or negative overall 
bias in the amount of gas that cools within nearby halos, depending on d and on Mi. 
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Fig. 12. — Bivariate Mixed-mass function. Upper panel: each sold line shows J- normalized by 
its uncorrelated value, as a function of d, for various values of Mi. The lines are labeled by their 
Ml values and in all cases zi = Z2 = 4, and M2 is the cooling mass at this redshift, 2.6 x W^Mq. 
For reference, the fully uncorrelated line is given by the dotted line and the fully correlated case 
in which Mi > M2 is shown by the dashed line; if Mi < M2 then the fully correlated case is 
zero. Lower panel: Normalized J- for a higher redshift case in which zi = Z2 = 10, and M2 is the 
(smaller) cooling mass at this redshift, 7.9 x W'^Mq, which corresponds to slightly rarer objects. 
The lines follow the same conventions as in the upper panel. 
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5. Conclusions 

In this work, we have developed a simple but powerful extension of the Press & Schechter (1974) 
model that goes beyond Bond et al. (1991) and can be used to study inhomogeneous structure for- 
mation and correlation functions among halos. Although the linear excursion-set model represents 
the backbone of our modern analytical Tindcrstanding of structure formation, it has previously only 
been able to predict the average number density of collapsed objects and has not supplied any in- 
formation as to their relative positions. While many authors have derived approximate expressions 
for spatial correlations either analytically or from numerical simulations, each of these was grafted 
externally onto the underling formalism, leaving inhomogeneous structure formation on slippery 
theoretical ground. 

Although a strict excursion set description of inhomogeneous structure formation can be con- 
structed only by solving the diffusion equation numerically, we have shown in this work that such 
solutions can be matched to within 2% accuracy using a simple approximation. By considering the 
trajectories of the overdensities at two points to be at first fully correlated and later fully uncor- 
related as a function of decreasing filter scale, and by carefully choosing the cutoff value between 
these two regimes, we have developed an analytical formalism appropriate for studying problems 
in which a simple one-point approach is insufficient. 

With this two-step approximation we are able to derive an approximation to the joint probabil- 
ity distribution for the overdensities of two points with arbitrary mass scales and collapse redshifts, 
separated by any comoving Lagrangian distance. This distribution leads directly to an analytical 
expression for the bivariate mass function of halos, given by eqs. (41) and (42), and for the bivariate 
cumulative mass fraction, given by eqs. (44) and (45). These expressions generalize a number of 
previous results, and also yield self-consistent expressions for the nonlinear biasing and correlation 
function of halos, given by eqs. (46) and (48). Our results also incorporate halo exclusion, i.e., the 
fact that if point A belongs to a given halo, all nearby points B are likely to belong to the same halo 
(see, e.g.. Figure 7). We have also shown that the two-step approximation yields the right value in 
every limit where it must match a result derived from the one-point approach. We have provided 
Gemini (see Appendix B), a publicly-available code that makes it easy to apply our formalism in 
specific cases. 

Our results form an analytical framework that can be used in conjunction with numerical simu- 
lations to study inhomogeneous structure formation in a manner similar to the approach commonly 
adopted to study average quantities; analytical techniques are used to outline the overall physical 
picture and quantify model uncertainties, and numerical techniques are used to refine these results 
through a limited series of detailed tests. Likewise, these comparisons must take into account many 
of the issues that arise in a spatially averaged context, such as the best choice of collapse density Sc 
(e.g., Kitayama & Suto 1996; Sheth, Mo, & Tormen 2001; Jenkins et al. 2001), possible corrections 
to the overall functional form of the mass function (e.g., Jenkins et al. 2001; Lee & Shandarin 1998), 
and the relationship between the Eulerian coordinate system which is observed and the Lagrangian 
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coordinates that are used in the excursion-set formahsm (e.g., Mo k. White 1996; Catelan et al. 
1998; Jing 1999). While the study of such issues will sharpen the hnk between this formahsm and 
more directly observable quantities, our method nevertheless already provides an important first 
step towards a better theoretical understanding of inhomogeneous structure formation. In this con- 
text the Lagrangian coordinate system intrinsic to an excursion-set description is a mixed blessing, 
for although it is more difficult to compare with simulations, it is usually much more important 
physically to have a measure of the total column depth of material separating two objects than 
their precise distance. 

For many years the study of structure formation has centered on the formation of individual 
dark matter halos and their direct progenitors, yet many classes of problems that are now being 
studied can not be addressed in this context; and while it was once assumed that gravity alone 
controlled cosmic evolution, many more recent issues in structure formation are better described 
as an interplay between the IGM and the objects that form within it (e.g., Barkana & Loeb 1999; 
Ciardi, Ferrara, k. Abel 2000; Scannapieco, Thacker & Davis 2001). Each new generation of objects 
changes the state of the gas, and this state in turn affects the properties of the next generation to 
form. 

^From the dissociation of molecular hydrogen, to reionization and the resulting photoevapo- 
ration of halos, to the epoch of galactic winds, our universe has time after time undergone intense 
and inhomogeneous transformations. Although the intricacies of such processes will undoubtedly 
take years of observational and theoretical investigation to unravel, it is clear already that one's 
answers will depend on one's relative location. 

We thank Dick Bond, Yuri Levin, Avi Loeb, and Anatoly Spitkovsky for useful comments and 
discussions. ES has been supported in part by an NSF MPS-DRF fellowship. RB acknowledges 
support from CITA. 

Erratum 

Immediately preceding the publication of this article we were alerted to the existence of a 
previous investigation into inhomogeneous structure formation from the point of view of excursion 
sets. In Porciani et al. (1998), the authors considered the simultaneous evolution of two points 
separated by a fixed initial comoving distance, but rather than approximate this evolution with 
a two-step procedure such as the one described in §3.3 of Scannapieco & Barkana (2002) they 
instead made use of the correlated two-point solution in the absence of harriers and imposed on 
this the same reflecting boundary conditions as in the fully imcorrelated case. In this way they 
were able to develop what amounts to an alternate approximation to our eq. (41) in the limited 
case of simultaneously collapsing halos. 

While both approaches yield similar results at small values of^/S, the approximation proposed 
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by Porciani et al. breaks down when the two points are highly correlated. In order to show this we 
consider the limit in which I'l = 1^2 = Si = S2 = S and ^ S, and restrict our attention to the 
physical range in which 81,62 < i^- In this case our two-step approximation simply reduces to the 
single absorbing barrier solution as given by eq. (28), 

Q{u,6i,62,S) = [G{5i,S) - G{2u - 6i,S)]SDiSi - 62) , (55) 

where G{5,S) is a Gaussian of variance S in the variable S [as in eq. (4)], and 6d is the one- 
dimensional Dirac delta function. In this notation, the correlated solution with uncorrelated bound- 
ary conditions as per eqs. (33) and (34) of Porciani et al. gives instead 

Q{u, Si, S2, S) = [G{Si,S) + G{2u - Si, S)] Sd{Si - 62), (56) 

which amounts to an unphysical increase of probability density in the presence of absorbing barriers. 
This is because as ^ — >^ 5, the double reflection as in the uncorrelated case becomes an increasingly 
poor approximation to the single reflection appropriate for correlated diffusion. Indeed, Figure 3 of 
Porciani et al. shows that their approximation fails to reproduce a numerical Monte Carlo solution 
when the two points are separated by a small distance; their approximation for the clustering 
properties of dark matter halos is accurate only for separations that are larger than the Lagrangian 
halo size by a factor of ~ 3-10, with the exact value depending on the power spectrum of density 
fluctuations and the halo mass. 



Appendix A: Evaluation of the Bivariate Mass Function 



In this Appendix we give various analytic expressions that are necessary for the explicit eval- 
uation of eq. (41). Using eq. (31) to compute the derivatives, we find 

d^Q± 1 _ \ SIS2 + S^Si - 2Si52i' 
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(^2 - 0' 



(58) 



while to compute d^Q±/d5\ we simply switch the 1 and 2 indices in all terms of the previous 
equation. Finally, we also require 
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and 
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(59) 



(60) 



(61) 



^2 (5i-e)2 {S2-0' ' 

although in our standard cases this term is not needed as and ^rmax only functions of the 
smaller of any two given values Si and >S'2 [see eq. (41)]. 



Appendix B: Gemini, A Toolkit for Analytical Models of Inhomogeneous Structure 

Formation 

In order to facilitate applications of the analytical model of inhomogeneous structure formation 
discussed in this paper, we have made public a program Gemini that is intended to be a toolkit 
that can be used to quickly evaluate the most important quantities derived in this work. The 
program is divided into two main sections; the first, xiroutines, contains routines that compute 
preliminary quantities including the mass correlation function, in an arbitrary cosmology, and the 
second, gemini, uses these quantities to calculate the bivariate functions described in this work. 

The routines that calculate £^{r,d), -^{r,d), and cr2(r) need only be run once for any given 
cosmology, as xiroutines constructs a table of such values indexed as functions of ln(r) and ln(d — r) 
with arbitrary resolution. Subsequent calls to xiroutines then interpolate between these values using 
a cubic spline technique. 

The routines contained in gemini, on the other hand, are more fragmented, with the user 
determining which subroutines need to be included into the main program in order to calculate the 
quantities relevant to the problem at hand. In this case no tabulation is used and the bivariate 
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mass function, /, is computed analytically. The cumulative mass fraction and mixed mass function, 
on the other hand, require a single numerical integration. These are carried out as a function of 
the following four variables, 



^1 



V2-VI Si-$, 

u= , v= W- 

z^i y 82-$. 



(62) 



which are chosen to have simple values in common limits such as Si = S2 or vi = 1/2- In terms of 
these quantities and in the case in which vi < U2 



F{i'i,i^2,Si,S2,0 = J- J{s,t,u,v) , 



where 



J(s, t, u, V 



Jo 



dx 



g-s(t-a;)2 _ g-s(t+a;)2 



erf (x) eri[v{tu + x)] 



(63) 



(64) 



while the case in which i^i < V2, is calculated by simply switching the 1 and 2 indices in eq. (63). 
Finally, the x integral for the mixed-mass function JT can be written as 
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(65) 

(66) 
(67) 



In this case the lower bound on the integration is defined as w = max(0, iit). 



Gemini is available at the web address http : //www . arcetri . astro . it/~evan/Gemini/ which 
also contains more detailed documentation regarding its use. 
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